Abstract

The birth/death process with mutation describes the evolution of a
population, and displays rich dynamics including clustering and fluctuations.
We discuss an analytical 'field-theoretical' approach to the birth/death
process, using a simple dimensional analysis argument to describe evolution
as a 'Super-Brownian Motion' in the infinite population limit. The field
theory technique provides corrections to this for large but finite
population, and an exact description at arbitrary population size. This
allows a characterisation of the difference between the evolution of a
phenotype, for which strong local clustering is observed, and a genotype
for which distributions are more dispersed. We describe the approach
with sufficient detail for non-specialists.

1 Introduction

Throughout the biological literature, the term "diffusion in genotype space"
is used to describe a population acting under genetic drift in the absence of
selection. This is not diffusion at the individual level, but at the population
level, where the individuals form clusters resembling a species, the mean position
of which performs a random walk, i.e. diffuses. The 'species' may consist of a
number of clusters at any given time. However, these clusters remain close
together, and the species is limited in the maximal width that it can achieve in
any given direction in the space [T].

Clustering phenomena are well understood for reproducing and dying organisms
dispersing in real space [H [3]. In the case of real space, the relationship of
the microscopic process to the stochastic Partial Differential Equation (PDE)
formalism is clear, due to the (exact) field theory mapping [4] of the underlying
microscopic process to a stochastic PDE. However, no such translation
has been done in the case of type (sequence) space, be it of a genotype or a
phenotype, where clustering phenomena are also observed [2 [HIS]. We perform
the translation and show that reproducing and dying organisms either diffusing
in real space or mutating in type space are fundamentally the same process in
an infinite population. This equivalence only applies in the infinite population
limit, and so we provide finite size corrections to the stochastic PDE, allowing
for individuals to mutate only on a birth event.

From known results for diffusing organisms, there exists an 'upper-critical
dimension' $d_c$, above which general 'mean field' results hold but below which
the behaviour is different. A phenotype forms a one dimensional type space,
which can be thought of as a single trait. Conversely, we consider a genotype
as a very long amino acid string, and hence genotype space is high dimensional
as mutations are free to occur at a large number of independent positions.
Therefore there exists an important distinction between the evolution of a given
phenotype, and a genotype. The theory of Critical Branching Processes 'T finds
that in high dimensions describing genotype space ($d > d_c$, where in our case the
critical dimension $d_c = 2$ [Hj), birth/death dynamics are described fully by the
lineages. A lineage remains distinct until all individuals in it die. However, in
low dimensions ($d < d_c$) describing a particular phenotype, additional clustering
within the distribution of the lineage occurs. Although sometimes distinct, the
clusters in phenotype space can merge, and hence clusters are poorly defined
entities. Instead, a careful average over the distribution called a 'peak' provides
a more useful description of phenotypes [T].

Critical Branching Processes have a total population that does a random
walk and only surviving lineages with $N(t_{\mathrm{final}}) > 0$ are considered. For real
space birth/death processes, the same phenomenological clustering and upper-critical
dimension $d_c = 2$ are also found when considering systems of fixed
population size [5]. As we show that neutral evolution has the same description
in the infinite population limit as Critical Branching Processes in real space,
this result also applies to evolution, and for qualitative studies we can choose
whether to consider systems of fixed or fluctuating total population.

We use the technique of second quantisation of a master equation and mapping
to a field theory [4], for which most previous work focuses on the infinite
population limit. However, field theory is a good tool for obtaining analytical
results for finite and changing population sizes, as is the case for real populations.
The technique was developed in the setting of reaction-diffusion systems,
where particles diffuse continuously, unlike our case of diffusion in a type space
where the diffusion only occurs on a birth or death event.

This paper addresses three main issues:

1. We discuss how a microscopic model of evolution can be represented as a
Field Theory, and derive the stochastic PDE that follows in this case.

2. We characterise the 'asymptotic' behaviour, i.e. the infinite population
and long wavelength properties of evolution, by identifying that this is a
solved problem. Super Brownian Motion and Critical Branching Processes
provide a wealth of results which are made explicit by the comparison of
the relevant stochastic PDEs.

3. We relate the asymptotic behaviour to the behaviour at finite population
size.

1.1 Known results for clustering due to birth/death processes

We will use known results of birth/death processes from the field of Critical
Branching Processes, which tackles similar problems to field theory techniques
but differs in approach and terminology. Refs. 'J^ and [S] give more details, and
a more direct technique is used in P].

As discussed above, the behaviour is qualitatively different above and below
a critical dimension $d_c$. In the language of statistics, for $d < d_c$ the only
stable solution is a $\delta_0$ measure in the correlations between individuals, i.e. all
individuals are fully correlated in their position. This implies that all individuals
are localised in space so that their distribution collapses to a point when
viewed at very large length-scales. In unscaled type space, this corresponds to
a local peak with a characteristic width ($s$), i.e. a length-scale. The length-scale
$s$ scales to 0 in the infinite population limit for $d < d_c$. However, for $d > d_c$
other non-trivial measures exist describing the correlations in the system, which
correspond to a distribution of individuals spread out over a number of lineages.
The distribution of time since last ancestor forms a power-law distribution in
the infinite population limit, and since lineages typically do not intersect in high
dimensions, there is no characteristic width to the distribution (and hence no
length-scale).

The theorems available for the clustering process are usually devoted to
deriving general behavioural properties, such as the convergence to either of
the above measures in various dimensions. Many clustering phenomena are
described by the same scaling relations for $d > d_c^i$, where $i$ is a label for a
particular phenomenon (e.g. birth/death processes in Euclidean space, as in our
model). Thus each model may have a different critical dimension, but above
that the scaling behaviour is the same. Examples are given in [7]: Galton-Watson
Trees embedded into space (which is the real space diffusion version
of our model) have $d_c = 2$, Directed Percolation [1] has $d_c = 4$, Percolation
has $d_c = 6$ and Lattice Trees (a lineage tree embedded in a lattice so that
separate lineages never meet) have $d_c = 8$. All dimensions refer to the number
of spatial dimensions; the stochastic process includes the extra dimension of
time. Below the critical dimension each model behaves specially, whereas above
the critical dimension all models follow the same scaling relation. All of these
models can be described as a birth/death process embedded in some type of
space.

Super Brownian Motion is the limiting process of all of the above processes
for $d > d_c$. This can be described by a stochastic PDE in many cases. In the
case of Galton-Watson Trees [10], in terms of the density $\rho$ as a function of space
$\mathbf{x}$ and time $t$, the stochastic PDE is:

$$\frac{\partial \rho(\mathbf{x},t)}{\partial t} = D\nabla^2\rho(\mathbf{x},t) + c\sqrt{\rho(\mathbf{x},t)}\,\eta(\mathbf{x},t), \tag{1}$$

where $D$ is the diffusion constant and $c$ is a constant describing the magnitude
of the noise $\eta(\mathbf{x},t)$. We will obtain this functional form in the infinite population limit
for the case of evolution in type space, which with some dimensional analysis
means that evolution as we define it has $d_c = 2$.
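To make the structure of Eq. (1) concrete, the following is a minimal numerical sketch of one explicit Euler-Maruyama step on a periodic one-dimensional lattice. It is an illustration only, not a method used in this paper: the function name `sbm_euler_step`, the periodic boundary, the lattice discretisation of the noise and the clamp at zero are all assumptions made for the example.

```python
import math
import random

def sbm_euler_step(rho, D, c, dx, dt, rng):
    """One explicit Euler-Maruyama step of Eq. (1) on a periodic 1-d lattice.

    rho is a list of densities, one per lattice site (hypothetical layout).
    """
    n = len(rho)
    new = []
    for i in range(n):
        # Discrete Laplacian with periodic neighbours (rho[-1] wraps around).
        lap = (rho[(i + 1) % n] + rho[i - 1] - 2.0 * rho[i]) / dx ** 2
        # Discretised space-time white noise: variance 1/(dx*dt) per cell.
        xi = rng.gauss(0.0, 1.0) / math.sqrt(dx * dt)
        value = rho[i] + dt * (D * lap + c * math.sqrt(max(rho[i], 0.0)) * xi)
        # Ad hoc clamp: the naive scheme can overshoot below zero.
        new.append(max(value, 0.0))
    return new
```

With `c = 0` the step reduces to plain diffusion and conserves the total density; the square-root noise term is what produces the clustering discussed above. The clamp is a crude fix and says nothing about convergence of the scheme.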

1.2 The model

We consider $N(t)$ individuals at time $t$, and each individual has a discrete type
$\mathbf{x} \in \mathbb{Z}^d$, which it retains during its lifetime. The dimension of type space $d$ is
arbitrary in the formalism. A timestep consists of performing a birth attempt with
probability $p_{\mathrm{off}}/(p_{\mathrm{off}} + p_{\mathrm{kill}})$, or otherwise a killing attempt occurs*. We focus
on the case $p_{\mathrm{kill}} = p_{\mathrm{off}}$ throughout this discussion. Time is measured in generations
and increases by the average waiting time between events, $1/[N(p_{\mathrm{off}} + p_{\mathrm{kill}})]$,
per timestep.

• Birth attempt: A parent individual (with type $\mathbf{x}$) is selected at random,
and an offspring is created with type $\mathbf{x}$ and mutated with probability $p_m$.
A mutation involves $\mathbf{x}$ changing by ±1 in a randomly chosen direction.

• Killing attempt: A randomly chosen individual is removed.

This definition differs from the case of a birth/death process with diffusion
in real space, where all individuals are diffusing constantly between birth/death
events. Our model permits 'diffusion' only on birth events via mutation. A
sample run is shown in Figure 1 for an early and late time, and a sample
distribution from diffusion is shown for comparison. Simulations at different
$N$ show that there is a clustering behaviour that persists regardless of $N$ (not
shown, though see [T]). We wish to understand how the clustering depends on
dimensionality, both qualitatively and quantitatively.
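The model rules above can be sketched directly as a simulation. The following is a minimal illustration and not the code used for the figures; the function name `simulate`, its parameters and the choice to start every individual at the origin are assumptions made for the example.

```python
import random

def simulate(n0, steps, p_m, d=1, p_off=1.0, p_kill=1.0, seed=0):
    """Run the birth/death process with mutation for a number of timesteps.

    Returns the list of types (tuples in Z^d) and the elapsed time in generations.
    """
    rng = random.Random(seed)
    pop = [tuple([0] * d) for _ in range(n0)]  # assumption: all start at the origin
    t = 0.0
    for _ in range(steps):
        if not pop:
            break  # extinction: nothing left to reproduce or kill
        n = len(pop)
        # Time advances by the mean waiting time between events.
        t += 1.0 / (n * (p_off + p_kill))
        if rng.random() < p_off / (p_off + p_kill):
            # Birth attempt: copy a random parent, mutate with probability p_m.
            child = list(rng.choice(pop))
            if rng.random() < p_m:
                axis = rng.randrange(d)
                child[axis] += rng.choice((-1, 1))
            pop.append(tuple(child))
        else:
            # Killing attempt: remove a random individual (swap with last, then pop).
            i = rng.randrange(n)
            pop[i] = pop[-1]
            pop.pop()
    return pop, t
```

For example, `simulate(10000, 10**6, 0.5)` would produce a one-dimensional population whose histogram of types can be compared with Figure 1. With `p_m = 0` no individual ever leaves the origin, which is a quick sanity check that movement happens only on a mutated birth.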



*This is an example of the Gillespie algorithm [11].

Figure 1: Sample distribution for $N = 10000$ individuals starting at 0 and evolving
in a 1-dimensional type space. For early time (black line) the distribution
behaves similarly to diffusion, but once the peak has become 'large', it begins
to move around, and can split up into a number of clusters (as shown by grey
line). The distribution for $N = 10000$ diffusing particles [T] is also shown (a
normal-like distribution centred at zero, dotted line) at early time; the width
increases as $\sqrt{t}$.



2 Field theory approach to obtain a Stochastic PDE

Doi's process of second quantisation [12] is used to obtain a Field Theory from
a Master Equation. A detailed description of the mapping process is presented
in [J, and a detailed background can be obtained from [13].

2.1 Outline of the method

We outline the method for obtaining a field theory from a Master Equation of
the form $dP(\{n\},t)/dt = f(\{n\})$. Here $P(\{n\},t)$ is the probability distribution
of the state $\{n\} = \{n_1, \cdots, n_i, \cdots, n_L\}$, where $L$ is the size of the type space
and $n_i$ is the population size of a given type $i$.

1. Define the state $|\{n\}\rangle$, and use the equation for $dP(\{n\},t)/dt$ to obtain
an equation of motion for $|\{n\}\rangle$.

2. Define $|\{\phi\}\rangle$ as the superposition of all possible states with probabilities
$P(\{n\},t)$. This provides an equation of motion of the form $d|\{\phi\}\rangle/dt =
-H|\{\phi\}\rangle$, where $H$ is called the Hamiltonian.

3. The state $|\{\phi\}\rangle$ is expressed in terms of operators; the field $\phi(\mathbf{x},t)$ emerges
as a natural quantity in the system, being the eigenvalue of the so called
'creation' operator, which counts the number of individuals and hence is
related to the density. The field $\phi(\mathbf{x},t)$ is simply a complex number defined
for all spatial points $\mathbf{x}$.

4. By taking the continuous space limit, the equations for $\phi(\mathbf{x},t)$ become
tractable.

5. Observables $A$ can be related to $\phi$ in terms of functional integrals. However,
most quantities of interest such as the density can be obtained directly
from examination of the Hamiltonian alone.

6. A stochastic PDE in our case can easily be obtained from $H$, providing
access to other techniques or, in our case, previous results.

We find that additional terms appear in the field theory due to the insistence
that movement only occurs on a birth; these are difficult to deal with
in analytical calculations. These terms are negligible in the infinite population
limit, in which case our model reduces to the real space birth/death process
with diffusion of individuals. This is a previously solved case, as discussed in
Section 1.1. We are also concerned with the finite population case, for which
we provide both an exact and an approximate stochastic PDE.

2.2 To Second Quantised form

We now follow the procedure of second quantisation of a Master Equation,
providing explicit details only for the one-dimensional case for readability. The
starting point is the master equation for the Evolution process as defined above.
The equation for the change in the probability distribution $p(\{n\})$ over all sites
$\{n\} = (n_1, \cdots, n_i, \cdots, n_{L-1}, n_L)$, written as $p(n_i)$ in a notation enumerating only sites
different from $\{n\}$ for brevity, is:

$$
\begin{aligned}
\Delta_t p(n_i;t) = \frac{p_{\mathrm{off}}}{2N}\sum_i \Big\{ & n_i\big[p_m\,p(n_{i-1}-1;t) + p_m\,p(n_{i+1}-1;t) - 2p(n_i;t)\big] \\
& + 2(1-p_m)(n_i-1)\,p(n_i-1;t) \\
& + 2(n_i+1)\,p(n_i+1;t) - 2n_i\,p(n_i;t)\Big\}.
\end{aligned}
\tag{2}
$$

Eq. (2) follows directly from a microscopic description of the model. We sum
over all possible lattice points $i$ where a change could occur. The terms from
left to right are, on the top line: entering state $\{n\}$ by a birth at $i$ mutating left;
entering by a birth at $i$ mutating right; leaving state $\{n\}$ by a birth at $i$. Second line: entering state
$\{n\}$ by a birth without mutation. Third line: entering state $\{n\}$ by a death at
$i$, and leaving state $\{n\}$ by a death at $i$. We have ignored boundary terms as
we will take $L \to \infty$.

The state $|n_i\rangle$ of a lattice point $i$ is defined as the number of individuals on
it. We then define the state of the system $|\{n\}\rangle = |n_1\rangle \otimes \cdots \otimes |n_L\rangle$, where $\otimes$
denotes the outer product.

Eq. (2) is multiplied by $|\{n\}\rangle$, and we then relabel the states within the
sum to ensure that the probabilities in all terms are expressed in terms of
$p(\{n\})$, allowing the 'state' vector to become different from $|\{n\}\rangle$. Then we can
define operators acting on the state in order to recover all terms in the summed
state $|\{n\}\rangle$. These operators also capture multiplicative terms in the number of
individuals $n_i$. We define the operators, called the annihilation operator $a$ and
the creation operator $a^\dagger$, by their commutation relations:

$$[a_i, a_j^\dagger] = \delta_{ij}, \tag{3}$$

$$[a_i, a_j] = [a_i^\dagger, a_j^\dagger] = 0. \tag{4}$$

The notation $[a_i, a_j^\dagger]$ means simply $a_i a_j^\dagger - a_j^\dagger a_i$. If we define the 'vacuum lattice'
$|0\rangle$ by $a_i|0\rangle = 0$ for all $i$, and $|n_i\rangle = (a_i^\dagger)^{n_i}|0\rangle$ then it is simple to show that the
operators follow:

$$a_i|n_i\rangle = n_i|n_i - 1\rangle, \tag{5}$$

$$a_i^\dagger|n_i\rangle = |n_i + 1\rangle. \tag{6}$$
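The operator relations can be checked on a single site by writing the operators as matrices on a truncated occupation basis $|0\rangle, \ldots, |n_{\max}\rangle$. The sketch below is illustrative only: the helper names `doi_operators`, `matmul` and `commutator` are hypothetical, and it assumes the unnormalised convention $|n\rangle = (a^\dagger)^n|0\rangle$, for which $a|n\rangle = n|n-1\rangle$ and $a^\dagger|n\rangle = |n+1\rangle$.

```python
def doi_operators(n_max):
    """Matrices of a and a-dagger on the truncated basis |0>, ..., |n_max>.

    Convention: column index = input state, row index = output state.
    """
    size = n_max + 1
    a = [[0] * size for _ in range(size)]
    a_dag = [[0] * size for _ in range(size)]
    for n in range(1, size):
        a[n - 1][n] = n      # a |n> = n |n-1>
        a_dag[n][n - 1] = 1  # a-dagger |n-1> = |n>
    return a, a_dag

def matmul(p, q):
    """Plain square-matrix product."""
    size = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def commutator(p, q):
    """[p, q] = p q - q p."""
    pq, qp = matmul(p, q), matmul(q, p)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(pq, qp)]
```

On this basis `commutator(a, a_dag)` is the identity except in the last diagonal entry, which is an artefact of truncating at `n_max`, and `matmul(a_dag, a)` is diagonal with entries `0, 1, ..., n_max`, i.e. it counts the individuals on the site.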

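On multiplication of Eq. (2) by the state $|\{n\}\rangle$, summation over all states $\{n\}$,
performing the relabelling and using the creation and annihilation operators,
we find:

$$
\begin{aligned}
\Delta_t \sum_{\{n\}} p(\{n\};t)\,|\{n\}\rangle = \frac{p_{\mathrm{off}}}{2N}\sum_{\{n\}}\sum_i p(\{n\};t)\Big\{ & p_m a_{i-1}^\dagger a_i^\dagger a_i + p_m a_{i+1}^\dagger a_i^\dagger a_i - 2a_i^\dagger a_i \\
& + 2(1-p_m)(a_i^\dagger)^2 a_i + 2a_i - 2a_i^\dagger a_i\Big\}|\{n\}\rangle.
\end{aligned}
\tag{7}
$$

Or we can write this in (quasi) Hamiltonian form, using the notation $|\{\phi\}\rangle = \sum_{\{n\}} p(\{n\};t)\,|\{n\}\rangle$:

$$\Delta_t|\{\phi\}\rangle = -H|\{\phi\}\rangle, \tag{8}$$

with the Hamiltonian:

$$H = \frac{p_{\mathrm{off}}}{2N}\sum_i\left[-p_m\left(a_{i-1}^\dagger + a_{i+1}^\dagger - 2a_i^\dagger\right)a_i^\dagger a_i - 2\left(a_i^\dagger - 1\right)^2 a_i\right]. \tag{9}$$

This completes the mapping to second-quantised form.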
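The step from the operator form of the master equation to the Hamiltonian is a regrouping of terms. As a sketch, per site $i$ and with the common prefactor $p_{\mathrm{off}}/2N$ omitted, the operator acting on $|\{n\}\rangle$ can be collected as follows (the Hamiltonian is minus this expression):

```latex
\begin{align*}
& p_m\left(a_{i-1}^\dagger + a_{i+1}^\dagger\right)a_i^\dagger a_i - 2a_i^\dagger a_i
  + 2(1-p_m)(a_i^\dagger)^2 a_i + 2a_i - 2a_i^\dagger a_i \\
&\quad = p_m\left(a_{i-1}^\dagger + a_{i+1}^\dagger - 2a_i^\dagger\right)a_i^\dagger a_i
  + 2\left[(a_i^\dagger)^2 - 2a_i^\dagger + 1\right]a_i \\
&\quad = p_m\left(a_{i-1}^\dagger + a_{i+1}^\dagger - 2a_i^\dagger\right)a_i^\dagger a_i
  + 2\left(a_i^\dagger - 1\right)^2 a_i .
\end{align*}
```

The first bracket is a discrete Laplacian acting on the creation operator, which is the origin of the diffusion-like term; the second is the usual birth/death term.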

2.3 From Second Quantisation to Field Theory

The next step involves constructing so called 'coherent states' such that $a_i|\phi\rangle =
\phi_i|\phi\rangle$, and $\langle\phi|a_i^\dagger = \langle\phi|\phi_i^*$ as described in [J. The eigenvalues $\phi_i$ and $\phi_i^*$ of
the $a$ and $a^\dagger$ operators respectively are complex numbers at a given point, and
therefore in the continuous limit form a field $\phi$ in space $\mathbf{x}$. This allows one
to calculate observables by use of a projection state. Here we simply use the
results that $a_i \to \phi_i$, $a_i^\dagger \to \phi_i^*$. The continuous limit is then taken by allowing
$\sum_i \to \int h^{-1}dx$, $\phi_i \to \phi(\mathbf{x},t)h$ and $\phi_i^* \to \tilde\phi(\mathbf{x},t)$, where we take $h$ (the distance
between lattice points) to zero. It can be shown that $\langle\phi\rangle = \langle n(\mathbf{x},t)/N\rangle$. We
consider the $\phi(\mathbf{x},t)$ and $\tilde\phi(\mathbf{x},t)$ fields to be independent. This completes the
mapping to a field theory, in terms of the Action in the Statistical Weight:



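$$S[\phi,\tilde\phi] = \int d^dx\left\{-\phi(t_f) - \tilde\phi(0)\left[1 - n_0\right] + \int_0^{t_f}\left[\tilde\phi\,\partial_t\phi + H(\phi,\tilde\phi)\right]dt\right\}, \tag{10}$$

expressed in terms of final time $t_f$, and the average initial occupancy $n_0$. The
Action is related to the expectation of an observable $A$ by:

$$\langle A(t)\rangle = \mathcal{N}^{-1}\int \lim_{\Delta t \to 0}\prod_i \mathcal{D}\phi_i\,\mathcal{D}\tilde\phi_i\; A(\{\phi\}_t)\exp\left(-S[\{\phi\},\{\tilde\phi\}]\right). \tag{11}$$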

We have introduced a normalisation factor $\mathcal{N}$ and the path integral notation
$\mathcal{D}\phi_i$. Path integrals of the form of Eq. (11) have been well studied and we will
discuss some of the methods available to avoid performing explicit integration, by
considering the Action $S$ directly. The Action depends only on the Hamiltonian
which we have derived from the Master Equation above. Following this process
for our case of neutral evolution from Eq. (9) we have:

$$H(\phi,\phi^*) = \frac{p_{\mathrm{off}}}{2N}\sum_i\left[-p_m\left(\phi_{i-1}^* + \phi_{i+1}^* - 2\phi_i^*\right)\phi_i^*\phi_i - 2\left(\phi_i^* - 1\right)^2\phi_i\right], \tag{12}$$

or in the continuum limit:

$$H(\phi,\tilde\phi) = D\int d^dx\left[-\left(\nabla^2\tilde\phi(\mathbf{x},t)\right)\tilde\phi(\mathbf{x},t)\phi(\mathbf{x},t) - \frac{2}{p_m}\left(\tilde\phi(\mathbf{x},t) - 1\right)^2\phi(\mathbf{x},t)\right]. \tag{13}$$

We have introduced the Diffusion Constant $D = (p_{\mathrm{off}}p_m/2)(h^2/N\,dt)$, which is
kept constant when taking the limit. The distance between types is $h$ and $dt$ is
the timestep. This equation is recovered for any dimensionality of Eq. (2). We
will use the notation $\phi(\mathbf{x},t) = \phi$ where this is unambiguous.

The 'classical solution' to Eq. (13) is obtained by considering only terms at
most bilinear in $\phi$ and $\tilde\phi$, corresponding to the noiseless case. This has $\tilde\phi = 1$,
as is easily checked using the methods from Section 2.6. Therefore it is useful
to perform a field shift $\tilde\phi \to \bar\phi + 1$ to obtain a neater Hamiltonian:

$$H(\phi,\bar\phi) = D\int d^dx\left[-\bar\phi\nabla^2\phi - \bar\phi\phi\nabla^2\bar\phi - \frac{2}{p_m}\bar\phi^2\phi\right]. \tag{14}$$



The variable we are working with here, $\phi$, does not correspond directly to the
real density we measure, although the expectation value for the two is the same.
The density [14] is $\tilde\phi\phi = \rho$, which can be obtained directly by defining $\tilde\phi = e^{\tilde\rho}$,
and $\phi = \rho e^{-\tilde\rho}$; $\rho$ is a real valued field. This allows us to obtain an explicit
equation for the density; however, the exponentials must be considered as their
sum expansion, and in our case they do not cancel. We will later be able to show
that the higher order terms are progressively less important when the population
is large. Writing the less important terms within sums, the Hamiltonian for the
real density for evolution in a type space is therefore:



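$$H(\rho,\tilde\rho) = D\int d^dx\left[-\tilde\rho\nabla^2\rho - \frac{2}{p_m}\tilde\rho^2\rho - \sum_{n=2}^{\infty}\frac{\tilde\rho^{\,n}}{n!}\nabla^2\rho - \frac{4}{p_m}\sum_{m=2}^{\infty}\frac{\tilde\rho^{\,2m}}{(2m)!}\rho\right]. \tag{15}$$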



We will use two different Hamiltonians in the following analysis, $H(\phi,\bar\phi)$ for the
complex field, and $H(\rho,\tilde\rho)$ for the real density field. Note that the over-line $\bar\phi$
notation refers to variables in the shifted field, not an average.

2.4 Noise in Field Theory

An equation for the time development of the distribution of particles is obtained
by taking the functional derivative (see e.g. [13])† of the Action $S$ with respect
to the complex field $\bar\phi$. Conversely, the equation for $\bar\phi$ is obtained by functional
derivation of the Action with respect to $\phi$, which often gives a pathological
equation for $\bar\phi(t)$. It is however possible to remove $\bar\phi$ from the stochastic PDE
for $\phi$ when the Action is quadratic in $\bar\phi$, by linearising the Action in $\bar\phi$. To do
this we introduce an auxiliary field $\eta$, which will correspond to a noise field. To
see why $\eta(\mathbf{x},t)$ is a noise field, consider a single point in the field. Suppose $\eta$ is
Gaussian, uncorrelated noise‡ with unit variance such that $\langle\eta(\mathbf{x},t),\eta(\mathbf{x}',t')\rangle =
\delta(\mathbf{x}-\mathbf{x}',t-t')$ and $p(\eta(\mathbf{x},t)) = e^{-\eta^2/2}/\sqrt{2\pi}$, then the Fourier transform of this
is:

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\eta^2/2}\,e^{-2\pi i k\eta}\,d\eta = e^{-2\pi^2k^2}. \tag{16}$$

Therefore, by writing $q = \sqrt{2}\pi k$:

$$e^{-q^2} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\eta^2/2}\,e^{-i\sqrt{2}q\eta}\,d\eta. \tag{17}$$

We can equate $q$ with (a convenient form of) the field at a particular point
$(\mathbf{x},t)$. For example the final term in Eq. (14) is of the form $-a\bar\phi^2\phi$ (and $H$
appears with an additional $-$ sign), so we can identify $q = \bar\phi\sqrt{-a\phi}$ and the
term translates with noise to $\bar\phi\sqrt{2a\phi}\,\eta$, where $\eta$ is Gaussian uncorrelated noise.
In this sense, the field $\exp(-\bar\phi^2)$ represents the 'integrated out' form of the noise.

†Note that this is the reverse of the standard method to obtain a field theory from a
stochastic PDE representation [15], and is simple to do in practice.

‡Many authors prefer to incorporate the variance and correlations into the noise, defining
correlators $\langle\eta(\mathbf{x},t),\eta(\mathbf{x}',t')\rangle$ that absorb all noise terms and their cross-correlations, which is
the appropriate form for further calculations. For clarity, we instead keep the noises simple and
retain the explicit magnitudes, but must be careful to combine the noise terms for calculations.
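The Gaussian identity used here, $e^{-q^2} = \frac{1}{\sqrt{2\pi}}\int e^{-\eta^2/2}e^{-i\sqrt{2}q\eta}\,d\eta$, is easy to confirm numerically. The following sketch uses a plain trapezoid rule; the function name `noise_average` and the integration cutoff and grid size are arbitrary choices made for the example.

```python
import cmath
import math

def noise_average(q, eta_max=12.0, n=24001):
    """Trapezoid estimate of (1/sqrt(2 pi)) * int exp(-eta^2/2) exp(-i sqrt(2) q eta) d eta."""
    h = 2.0 * eta_max / (n - 1)
    total = 0j
    for k in range(n):
        eta = -eta_max + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-0.5 * eta * eta) * cmath.exp(-1j * math.sqrt(2.0) * q * eta)
    return total * h / math.sqrt(2.0 * math.pi)
```

Comparing `noise_average(q)` with `math.exp(-q * q)` for a few real values of `q` shows agreement to numerical precision, with a vanishing imaginary part. This is the sense in which a quadratic term in the exponent can be traded for a term linear in the field multiplied by unit Gaussian noise.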

We proceed to calculate the linearised version of the noise for the above
problems, with terms appearing in the formalism as $\exp(-H)$. We can replace
the form $q^2$ with $-i\sqrt{2}q\eta$ in $S$, which if $q$ is already in the form $\bar\phi$ will give us
the result immediately, or we can rearrange the result by parts. If $q^2$ is negative
(i.e. the original term was negative in $H$), this will lead to a real noise term,
and conversely an imaginary noise term if $q^2$ is positive. Firstly, we rearrange
one noise term found in Eq. (14) using integration by parts into an appropriate
form:

$$-\bar\phi\phi\nabla^2\bar\phi = -(\bar\phi^2/2)\nabla^2\phi + \phi(\nabla\bar\phi)^2. \tag{18}$$
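This rearrangement can be checked with two integrations by parts. The following is a brief sketch, assuming boundary terms vanish and reading the two fields as $\phi$ and the shifted field $\bar\phi$; the equalities hold under the spatial integral rather than pointwise:

```latex
\begin{align*}
\nabla^2(\bar\phi^2) &= 2\bar\phi\nabla^2\bar\phi + 2(\nabla\bar\phi)^2
  \;\Rightarrow\; \bar\phi\nabla^2\bar\phi = \tfrac{1}{2}\nabla^2(\bar\phi^2) - (\nabla\bar\phi)^2, \\
-\int d^dx\,\phi\,\bar\phi\nabla^2\bar\phi
  &= -\tfrac{1}{2}\int d^dx\,\phi\,\nabla^2(\bar\phi^2) + \int d^dx\,\phi\,(\nabla\bar\phi)^2 \\
  &= -\tfrac{1}{2}\int d^dx\,\bar\phi^2\,\nabla^2\phi + \int d^dx\,\phi\,(\nabla\bar\phi)^2 .
\end{align*}
```

The last step moves the Laplacian from $\bar\phi^2$ onto $\phi$, which leaves one term quadratic in $\bar\phi$ with a negative sign and one term quadratic in $\nabla\bar\phi$ with a positive sign.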

Noise terms normally cannot be decomposed without consideration of cross-correlations.
Explicit consideration of correlations is complicated in this case as
the $\bar\phi$ term appears within a gradient operator in some terms, but not in others.
Fortunately, we can perform decomposition to real and imaginary parts;
although care must be taken [16] to determine the relative importance of combined
real and imaginary noises, we can separate the real components (i.e. negative
terms in $H$) and imaginary components (i.e. positive terms in $H$) in Eq.
(14) using Eq. (18):

$$H(\phi,\bar\phi) = D\int d^dx\left[-\bar\phi\nabla^2\phi + \phi(\nabla\bar\phi)^2 - \frac{\bar\phi^2}{2}\nabla^2\phi - \frac{2}{p_m}\bar\phi^2\phi\right]. \tag{19}$$



Since the noise terms appear as $\exp(-H)$, and so have the opposite sign to how they
appear in $H$, they transform as follows:

$$D\phi(\nabla\bar\phi)^2 \to i\bar\phi\,\nabla\!\left(\eta\sqrt{2D\phi}\right), \tag{20}$$

$$-\frac{D}{2}\bar\phi^2\left(\nabla^2\phi + \frac{4}{p_m}\phi\right) \to \bar\phi\,\eta\sqrt{D\nabla^2\phi + \frac{4D}{p_m}\phi}. \tag{21}$$

Eq. (21) also appears in the equation for the real noise from Eq. (15), using the
$n = 2$ term from the sum; the translation to a noise field is only valid when the
remaining sums are discarded as there would be correlations to consider with
the higher order terms. Additionally, the imaginary Eq. (20) term is absent
as the $\rho$ field is constructed to be strictly real. Also, we will later need the
linearised form for the $\bar\phi^2\phi$ term:

$$-\frac{2D}{p_m}\bar\phi^2\phi \to 2\bar\phi\sqrt{\frac{D\phi}{p_m}}\,\eta. \tag{22}$$



The above fields can be simply described. Eq. (20) represents so called 'diffusive'
noise (in the imaginary axis, for our case): it is conserved (the differential
ensures that what goes in at one point comes out at the next) and decays
with $\phi$ as $\sqrt{\phi}$ (recall $\langle\phi\rangle = \langle n(\mathbf{x},t)/N\rangle$). Eq. (22) describes 'square-root' (in
magnitude) multiplicative noise. Because it is non-conservative, multiplicative
noise in general can have dramatic effects on the behaviour of the system. We
call Eq. (21) 'mutation noise', as it arises from movement only on a mutated
birth event.

Representation of mutation noise as a stochastic PDE must be done carefully,
as the term inside the square root may become negative when the gradient is
large and negative. This is not physical and is due to representing the evolution
process as continuous field theory, then forcing the field theory into a stochastic
PDE. Consider the discrete representation of the term inside the square root in
Eq. (21):

$$\nabla^2\rho(x,t) + \frac{4}{p_m}\rho(x,t) = \rho(x+1,t) + \rho(x-1,t) + \left(\frac{4}{p_m} - 2\right)\rho(x,t). \tag{23}$$

This is positive definite since $p_m < 1$. We therefore impose the extra constraint
that $\rho(x+1,t) + \rho(x-1,t) \geq 0 \Rightarrow \nabla^2\rho(x,t) \geq -2\rho(x,t)$ on mutation noise. This
can be achieved by using $\Theta(\nabla^2\rho(x,t) + 2\rho(x,t))$, where $\Theta(y)$ is the Heaviside
step function; $\Theta(y) = 0$ for $y \in (-\infty,0)$ and $\Theta(y) = 1$ for $y \in (0,\infty)$. This
ensures positivity but the convergence properties as $\Delta x \to 0$ are not currently
known.
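The positivity constraint can be illustrated on a lattice with unit spacing. The sketch below is an assumption-laden example rather than the authors' implementation: the names `heaviside` and `mutation_noise_amplitude` are hypothetical, the factor multiplying $\rho(x,t)$ inside the square root is passed in as `coeff` rather than fixed, and the value of the step function at exactly zero (which the text leaves undefined) is taken to be 0.

```python
import math

def heaviside(y):
    """Step function: 0 for y < 0, 1 for y > 0; y == 0 is treated as 0 (an assumption)."""
    return 1.0 if y > 0 else 0.0

def mutation_noise_amplitude(rho, i, coeff):
    """Gated square root of (discrete Laplacian + coeff * rho) at interior site i."""
    lap = rho[i + 1] + rho[i - 1] - 2.0 * rho[i]
    # Gate: the argument is rho[i+1] + rho[i-1], i.e. the constraint lap >= -2 rho.
    gate = heaviside(lap + 2.0 * rho[i])
    inside = lap + coeff * rho[i]
    # max(...) guards against unphysical negative densities in the input.
    return gate * math.sqrt(max(inside, 0.0))
```

For an isolated occupied site, `rho = [0.0, 1.0, 0.0]`, the gate closes and the amplitude is zero; for a flat profile the Laplacian vanishes and only the `coeff * rho` part contributes.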

2.5 Dimensional Analysis

In order to establish which terms are important for the 'large scale' behaviour
of the system, dimensional analysis can be used. This involves considering the
contribution of terms at different scales by assigning dimensions to the constants
(called coupling constants) of each term, and ensuring that the equations
are dimensionally consistent. The system is then rescaled and the constants
will change according to their dimensions. We consider the 'long wavelength'
limit, so that distance scales are much longer than any lattice spacing and the
'fine structure' is averaged out. The fine structure of models will depend on
details such as the definition of the lattice and the exact form of mutations (i.e.
whether strictly nearest neighbour or with some short ranged distribution such
as exponential). Fine structure is lost in the dimensional rescaling, but many
models have the same phenomenological description, i.e. are described identically
at the macroscopic scale of large wavelengths. Thus the rescaling can result
in significantly simpler models in which only the most important details are
retained.

The following results will hold in the asymptotic limit of large population
$N$, and apply only to the description at large scales. For all finite $N$ there will
be a clustering length scale. Together with $\kappa$, there would be two length scales
in the problem and the appropriate dimensions for the coupling constant cannot
be uniquely determined, hence dimensional analysis cannot be applied.

As all terms in the Action $S$ given by Eq. (10) appear in an exponential
(in Eq. (11)) they must be dimensionless. We define a wavevector $\kappa = h^{-1}$
as our unit of measurement (with $h$ a small length scale). Each term in the
Hamiltonian $H$ contains an integral of dimensions $\kappa^{-d}t^{-1}$ ($d$ is the dimension of
space), hence each term must have a spatial dimension of $\kappa^d$ and time dimension
$t$. Scaling of space will extract the relative importance of the terms in $H$; time
scaling must then be performed to ensure that the equation retains the time
derivative term in the Action $S$ with the dominant term(s) from $H$. The time
scaling is not of interest in this case and we will not consider it further.

There is an arbitrary choice when defining the dimensions of $\phi$ and $\bar\phi$, provided
that $[\bar\phi\phi] = \kappa^d$; however, there is a 'natural' choice, meaning a choice in
which the scaling dimensions of the terms is correct. We will identify the natural
choice for our case in Section 2.6 but progress can still be made without
assuming a particular dimensionality of $\phi$ as some terms are irrelevant in the
limit $\kappa \to \infty$ regardless of the dimensional assignment.

We will be rescaling to large $\kappa$, and hence only the highest order terms
in $\kappa$ are 'relevant', as they will dominate the effective equation at large $\kappa$.
We consider $D$ dimensionless, but introduce a coupling constant on all terms
that contains all dimensional components; this constant has magnitude 1 in the
original, unscaled system. The derivative $\nabla$ has scaling dimension $[\nabla] = \kappa^1$.
The dimension of the fields are $[\phi] = \kappa^{d-\epsilon}$ and $[\bar\phi] = \kappa^{\epsilon}$, where $\epsilon$ is the parameter
that controls the relative dimensions of the field and must be in $[0, d]$.

We proceed with an analysis of the dimensions of the coupling constants.
Performing the full Renormalisation Group analysis [4] would explicitly perform
the rescaling, providing details of how the scaling occurs and giving the natural
choice of $\epsilon$ as a by-product. We do not perform this analysis, but instead consider
all possible values of $\epsilon$ for now, and use previous results to identify the correct
choice. The coupling constants introduced will be called $a_i$, where $i$ is just a
label. The terms of interest in $H(\phi,\bar\phi)$ from Eq. (14) are:

$$[a_1\bar\phi\nabla^2\phi] = \kappa^{d+2}[a_1] = \kappa^d \;\Rightarrow\; [a_1] = \kappa^{-2}, \tag{24}$$

$$[a_2\bar\phi\phi\nabla^2\bar\phi] = \kappa^{d+2+\epsilon}[a_2] = \kappa^d \;\Rightarrow\; [a_2] = \kappa^{-2-\epsilon}, \tag{25}$$

$$[a_3\bar\phi^2\phi] = \kappa^{d+\epsilon}[a_3] = \kappa^d \;\Rightarrow\; [a_3] = \kappa^{-\epsilon}. \tag{26}$$

Recalling that $\epsilon \in [0,d]$, hence $\kappa^{-2} \geq \kappa^{-2-\epsilon}$, we can conclude the following.
If we assume $\epsilon = 0$, then Eq. (26) dominates both Eq. (24) and Eq. (25).
If we instead assume $\epsilon > 0$, then Eq. (24) dominates Eq. (25), although the
importance of the remaining terms cannot be determined without knowledge
of $\epsilon$. Therefore the term $\bar\phi\phi\nabla^2\bar\phi$ from Eq. (25) can be discarded, and the
Hamiltonian $H_0$ for infinite $N$ can be written:

$$H_0(\phi,\bar\phi) = \int d^dx\left[-D\bar\phi\nabla^2\phi - \frac{2D}{p_m}\bar\phi^2\phi\right]. \tag{27}$$

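Some of these terms are also present in $H(\rho,\tilde\rho)$, but there are additional terms
that appear when considering the sums in Eq. (15), where we have the minimum
summand variables $m = n = 2$ in the following terms:

$$[b_m\tilde\rho^{\,2m}\rho] = \kappa^{d+(2m-1)\epsilon}[b_m] = \kappa^d \;\Rightarrow\; [b_m] = \kappa^{-(2m-1)\epsilon}, \tag{28}$$

$$[c_n\tilde\rho^{\,n}\nabla^2\rho] = \kappa^{d+2+(n-1)\epsilon}[c_n] = \kappa^d \;\Rightarrow\; [c_n] = \kappa^{-2-(n-1)\epsilon}. \tag{29}$$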

Hence we cannot yet truncate the exponential as all terms may be important,
as they only scale negatively for certain values of $\epsilon$. However, in the case $\epsilon > 0$
then the real field $\rho$ rescales to the same equation as the complex field $\phi$,
i.e. $H_0(\rho,\tilde\rho) = H_0(\phi,\bar\phi)$. Eq. (27) then provides the correct description of
evolution in the infinite population limit. At large but finite $N$ the importance of
terms will correspond to their scaling dimension and hence we can make various
levels of approximation in order to capture these details. The approximate
Hamiltonian for the real density is obtained from Eq. (15) by considering the
first order correction to Eq. (27), given by the $n = 2$ term from Eq. (15):

$$H_1(\rho,\tilde\rho) = \int d^dx\left[-D\tilde\rho\nabla^2\rho - \frac{D}{2}\tilde\rho^2\nabla^2\rho - \frac{2D}{p_m}\tilde\rho^2\rho\right]. \tag{30}$$

2.6 Stochastic PDEs

A stochastic PDE can sometimes be obtained from the field theory by calculating
the functional derivative of the Action $S$, as discussed earlier. This is possible
when all terms can be linearised, and we have presented such Hamiltonians at
various levels of approximation. Other Hamiltonians that cannot be made bilinear
in the fields, such as Eq. (15), will not yield a stochastic PDE and will not be
considered here. However, the available forms are the most important, consisting of:

1. Eq. (19), the exact equation for the evolution of the complex field $\phi$.

2. Eq. (27), valid for $\phi$ in the infinite population limit and, as we will find,
also valid for the real density $\rho$.

3. Eq. (30), the first order correction at large but finite population for the
real density $\rho$.

The complete Hamiltonian $H(\tilde\phi, \phi)$ from Eq. (14) is rearranged to Eq. (19),
which transforms with noise via Equations (20) and (21) to:



Hi(jJ,4',v) = dx 



4D0- 



V Pm 



(31) 

The two noise fields $\eta_1(\mathbf{x},t)$ and $\eta_2(\mathbf{x}',t')$ are uncorrelated with
unit variance, and form the real and imaginary parts of the noise with the given
magnitudes. Therefore, considering the full action $S$ and taking the functional
derivative with respect to $\tilde\phi$, we obtain the stochastic PDE for the complex
field in the evolution case:
^^^=i5v'0+W^V^0+^m+*V(r?2v/2^)- (32)

^^ y Pni 

This equation is valid at arbitrary population size $N$, and is an exact
representation in the sense that it captures the finite population size effects
correctly ($N$ appears via the density $\phi(\mathbf{x},t) = n(\mathbf{x},t)/N$). The only
approximation involved is the use of continuous time and space, but the same
'amount of individual' will be moved in a time unit as in the discrete case, with
equal variance in both space and time.

Similarly, the best possible stochastic PDE for the real density in evolution
is obtained from Eq. (30):



dpix,t) _ n_2 „/„ ^^ , /r, /^„2./„ ^^ , 4p(x,t) 



with the added constraint that $\nabla^2\rho(\mathbf{x},t) > -2\rho(\mathbf{x},t)$. The 'mutation
noise' appears because our individuals only 'move' when they reproduce, rather than
diffusing throughout their lifetimes. We have to introduce a cutoff in the gradient
to ensure that the mutation noise remains real.

Finally, we will show that only the square-root noise is 'relevant' in the
infinite population limit, both for the real density $\rho$ and the complex field $\phi$,
so in this case $\phi = \rho$. The stochastic PDE obtained in this case from Eq. (27) is:



^-^V^p(x,^) + J^^^^.(x,t), (34) 

which is the equation for a birth/death process in which individuals diffuse in
real space, given by Eq. (1).

We complete the analysis with the deduction of the dimensions using $\epsilon$,
and hence justify our claim that $d_c = 2$ and that the real space birth/death
process is equivalent in the infinite limit to the evolution process.

1. The full evolution equation for the complex field $\phi$, given by Eq. ^^,
is dimensionally dominated by either 'square-root' noise or diffusion
depending on $d$, and therefore we can take the large wavelength, infinite
population limit and obtain the Hamiltonian for this process given by Eq.
(27), hence the stochastic PDE given by Eq. (34).

2. Eq. (34) is the same as Eq. (1) from Super-Brownian Motion in all dimensions.
Hence we can establish that $d_c = 2$ as in Super-Brownian Motion,
and by combining Equations ([M]) and ^^ for the dimensions of the
deterministic diffusion term and the square-root noise term respectively, we
find that $\epsilon = d$.

3. Therefore the real field described by Eq. (15) can also be described by
Eq. (27) in the large population limit. The truncation of the extra
terms present in the real field is now justified, as each is dimensionally
less important in $d > 1$ than the terms retained (from Equations (28) and (29)).

4. Finally, by considering the terms that decrease less quickly with $\kappa$, the
stochastic PDE representing the leading order (large but finite $N$) correction
to Eq. (27) is given by Eq. (33).

2.7 Numerical simulation of the stochastic PDEs

We have obtained approximate stochastic PDEs (alternatively called Langevin
equations) for evolution (Equations (32), (33) and (34)) and argued that the
mutation noise term reduces to simple $\sqrt{\rho}$ noise in the large $N$ limit. We also
have an exact equation for the complex field $\phi$, which can be related to the real
density distribution. This is done by noting that, in operator notation, an operator
can be reordered by using the commutation relation so that it is 'normal ordered'
(i.e. all $a^\dagger$ are to the left of all $a$). The average of a normal ordered operator
is identical to the average of the same operator with the $a^\dagger$ operators removed
[3]; that is, $\langle (a^\dagger)^m f(a) \rangle = \langle f(a) \rangle$. We can normal order
operators by using the commutation relation, remove the $a^\dagger$ operators and then
take the continuous limit as before. Therefore:



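$$\langle \rho \rangle = \langle a^\dagger a \rangle = \langle a \rangle = \langle \phi \rangle, \tag{35}$$

$$\langle \rho^2 \rangle = \langle a^\dagger a a^\dagger a \rangle = \langle a^\dagger (a + a^\dagger a a) \rangle = \langle \phi \rangle + \langle \phi^2 \rangle. \tag{36}$$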

Continuation to arbitrary moments is straightforward; each depends only on
previous moments. Therefore, if we can calculate (or simulate from an exact
equation) the moments of the complex field $\phi$, we can calculate the moments of
the real density exactly.
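The pattern of Eqs. (35) and (36) can be sketched in code. The example below is an illustration for a single site and is not taken from the paper: it assumes the standard normal-ordering identity $(a^\dagger a)^k = \sum_j S(k,j)\,(a^\dagger)^j a^j$, where $S(k,j)$ are Stirling numbers of the second kind, so that $\langle \rho^k \rangle = \sum_j S(k,j) \langle \phi^j \rangle$. The function names `stirling2` and `density_moment` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(k, j):
    """Stirling number of the second kind S(k, j)."""
    if k == j:
        return 1
    if j == 0 or j > k:
        return 0
    return j * stirling2(k - 1, j) + stirling2(k - 1, j - 1)


def density_moment(k, phi_moments):
    """Return <rho^k> given phi_moments[j] = <phi^j> for j = 0..k."""
    return sum(stirling2(k, j) * phi_moments[j] for j in range(1, k + 1))


# Illustrative check: a sharp (noise-free) field phi = lam corresponds to a
# Poisson-distributed occupation number with mean lam.
lam = 2.0
phi_moments = [lam ** j for j in range(4)]

first = density_moment(1, phi_moments)   # <rho>   = <phi>
second = density_moment(2, phi_moments)  # <rho^2> = <phi> + <phi^2>
third = density_moment(3, phi_moments)   # <rho^3> = <phi> + 3<phi^2> + <phi^3>
```

For $k = 1$ and $k = 2$ this reproduces Eqs. (35) and (36); the higher moments follow the same recursion.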

The final problem is that of iterating a stochastic partial differential equation,
which is not a trivial task. We use the 'splitting' method presented
in [T71 [IHl [H] to numerically integrate the stochastic PDE accurately, where
possible. This method is guaranteed to give the correct results for the square-root
noise term, but other terms can cause problems; in particular, $\phi = 0$ is supposed
to be an absorbing state and it should be an attractor. The simplest approach of
using Gaussian noise multiplied by $dt$ directly changes $\phi = 0$ from an absorbing
state to an unstable steady state, with zero probability of finding it at any finite

dt. The numerical solution of the At(j){x, t) ~ W -j y^ (j) H — —ri{x, t) term does 
not seem to have an algorithm for finite $dt$ in the literature, and so we use ad-hoc
truncation to zero as described in Section 2.4, though it should be noted that
pursuing numerical integration is dangerous in this case.
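To illustrate why a splitting scheme can treat the square-root noise term correctly, the following is a minimal sketch of one well-known scheme of this type (due to Dornic, Chaté and Muñoz); whether it coincides in detail with the method cited above is an assumption. For $\partial_t \rho = \sigma\sqrt{\rho}\,\eta$ (Itô), the density after a time $dt$ can be sampled exactly as a Poisson-mixed Gamma variable, which keeps $\rho \ge 0$ and leaves $\rho = 0$ absorbing. The names `sqrt_noise_step`, `diffusion_step`, `split_step` and the lattice noise amplitude `sigma` are hypothetical, and the diffusion part uses a plain explicit Euler step.

```python
import numpy as np


def sqrt_noise_step(rho, sigma, dt, rng):
    """Exactly sample rho(t + dt) for d rho/dt = sigma * sqrt(rho) * eta."""
    lam = 2.0 / (sigma ** 2 * dt)
    k = rng.poisson(lam * rho)
    # A Gamma variable with shape k = 0 is taken to be 0, so rho = 0 stays absorbing.
    g = np.where(k > 0, rng.gamma(np.maximum(k, 1)), 0.0)
    return g / lam


def diffusion_step(rho, D, dt, dx):
    """Explicit Euler step of d rho/dt = D * laplacian(rho), periodic boundaries."""
    lap = (np.roll(rho, 1) + np.roll(rho, -1) - 2.0 * rho) / dx ** 2
    return rho + D * dt * lap


def split_step(rho, D, sigma, dt, dx, rng):
    """One splitting step: deterministic diffusion, then exact noise sampling."""
    return sqrt_noise_step(diffusion_step(rho, D, dt, dx), sigma, dt, rng)


# Illustrative check of the noise step alone: the mean is preserved and the
# variance after one step is sigma^2 * rho * dt.
rng = np.random.default_rng(0)
out = sqrt_noise_step(np.ones(200_000), sigma=1.0, dt=0.1, rng=rng)
zeros_out = sqrt_noise_step(np.zeros(10), sigma=1.0, dt=0.1, rng=rng)
```

The explicit diffusion step only preserves non-negativity when $D\,dt/dx^2 \le 1/2$; this sketch does not address the mutation noise term, for which the text above notes that no such algorithm is available.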
2.7.1 Numerical results

We now present the results of simulating the partial differential equations
obtained for the diffusion case and the approximated evolution case, and show
that they behave approximately as expected. For comparison purposes, we also
show the behaviour of the stochastic PDE obtained from a simple diffusion of $N$
non-interacting particles (see Appendix A):



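$$\frac{\partial \rho_D(\mathbf{x},t)}{\partial t} = D \nabla^2 \rho_D(\mathbf{x},t) + \nabla\left(\eta(\mathbf{x},t)\sqrt{2D\rho_D(\mathbf{x},t)}\right). \tag{37}$$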



Note the presence of the $\nabla$ on the noise term, ensuring that this noise conserves
particles locally. Figure 2 (Left) shows the general behaviour of the distributions
for Eq. (34), containing only square root noise and corresponding to the 'real
space' case of reproducing and dying particles subject to spatial diffusion. The
Figure demonstrates that square root noise does produce clustering and permits
local and global extinction. However, we find that Eq. (34) only captures the
qualitative aspects of the clustering. Figure 2 (Right) shows the (ensemble
averaged) Interface Width§ of the distribution against time, comparing the various
cases. We compare all of the stochastic PDEs we have encountered, and see
that all terms are important quantitatively: the width is not correctly represented
in any approximation, even at this fairly large value of $N = 10000$. The
diffusion stochastic PDE (Eq. (37)) fits the Master Equation solution (Eq. (7)
from [4]) closely. Only diffusion and square root noise can be guaranteed to be
accurate numerical integrations of the corresponding stochastic PDEs, due to
the numerical problems discussed above. The average interface width increases
with time after passing some minimum value in all evolution cases, because the
total population is not conserved. Since we disregard runs in the ensemble
average where the population becomes extinct, the average population size tends
to increase from its initial value [7].

§The Interface Width $\langle n(x)^2 \rangle - \langle n(x) \rangle^2$ is not directly related to the
standard deviation of the distribution, but rather the distribution is viewed as an
interface. The Interface Width describes the 'roughness' or deviation of the
distribution from a straight line.
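As an illustration of the quantity plotted in Figure 2 (Right), the sketch below computes the Interface Width $\langle n(x)^2 \rangle - \langle n(x) \rangle^2$ for a discretised distribution. It assumes that the angle brackets denote an average over positions $x$ within a single run, and that the ensemble average is then taken over surviving runs only; both are assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np


def interface_width(n):
    """<n(x)^2> - <n(x)>^2, with <.> taken as the average over positions x."""
    n = np.asarray(n, dtype=float)
    return float(np.mean(n ** 2) - np.mean(n) ** 2)


def ensemble_interface_width(runs):
    """Average the width over runs, discarding runs whose population is extinct."""
    widths = [interface_width(n) for n in runs if np.sum(n) > 0]
    return float(np.mean(widths)) if widths else float("nan")


flat = interface_width([1, 1, 1, 1])      # a perfectly flat 'interface'
peaked = interface_width([0, 2])          # all individuals on one of two sites
ensemble = ensemble_interface_width([[0, 2], [0, 0], [2, 0]])
```

A flat distribution has zero width under this definition, while concentrating the population on fewer sites increases it, which is the sense in which the width measures 'roughness'.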
Figure 2: Left: Evolved population distribution for the evolution case using the
approximation of infinite $N$, given by Eq. (34). Shown is the distribution at
various times. Note that it allows the distribution to split into multiple clusters
as in the original Master Equation, and local extinction is possible. Qualitatively,
the model appears accurate. Right: Interface width $\langle n(x)^2 \rangle - \langle n(x) \rangle^2$ as a
function of time for the various relevant cases, with initial conditions of all
population starting at position 0. Plotted are, from the bottom up, '1. Diffusion
ME': the Master Equation evaluation of Diffusion (Eq. (7) from [4]), which is
indistinguishable on this plot from the diffusion stochastic PDE from Eq. (37).
'2. Evolution (LE, root n noise only)': given by Eq. (34). '3. Evolution (LE,
real only)': the full solution given by Eq. (32) with the insistence that the field
is real (truncation of negative densities to 0). '4. Evolution (LE, full)': the full
solution (Eq. (32)) itself. Finally, '5. Evolution (ME)': the Master Equation
evaluation of evolution (Eq. (2)). Only the full consideration of the complex
solution yields the desired behaviour, although the qualitative dynamics of a
peak are captured. We use $D = 0.25$ and $N(t = 0) = 10000$ throughout.



The full complex solution from Eq. (32) quantitatively captures the dynamics,
although it must be admitted that the 'ad-hoc' nature of the process for
discretisation permits us to try different procedures and keep only those that
worked. This successful regime used a truncation of densities under a small
amount chosen specifically for the time-step and total population, and the
time-step was chosen very small. Gaussian random numbers were used for the
generation of the noises.

3 Summary

We found that death and reproduction with mutation in a type space is
identically described in the large scale and population limit as diffusing particles
undergoing birth/death processes, and is therefore described by a Super-Brownian
Motion. This tells us that the critical dimension for the evolution process is
2 in Euclidean space. Hence specialised models such as presented in [1] are
essential for considering the distribution of a given phenotype ($d < 2$). In higher
dimensions $d > 2$ lineage analysis is sufficient to describe the distribution of
types developing in time, when coupled with the representation of type space.
However, all cases require a microscopic consideration of the underlying process
for calculations at finite $N$. Our simple Field Theory analysis has provided an
exact description of the problem in the form of a stochastic PDE, Eq. (32).
We found the first order correction to the infinite $N$ stochastic PDE, given
by Eq. (33). This is valid when the total population $N$ is large but finite.
Obtaining the correct stochastic PDE to represent a microprocess is non-trivial and
mistakes are often made, as discussed in Ref. [16], and hence a careful derivation
such as ours is very important. Our work brings together previous results
and makes the underlying clustering process in evolution explicit. Field theory
is a tool that permits examination of finite systems, and our work discusses the
relevancy of stochastic PDEs in this case.

A Calculating the Stochastic PDE for Diffusing particles

The starting point for diffusion is the Action for diffusing and non-interacting particles,
obtained from Eq. (35) in [4] for diffusing interacting particles, by setting the reaction
rate $\lambda_0$ to zero:



$$A_D(\tilde\phi, \phi) = \int d^dx \int dt \left[ \tilde\phi\,\partial_t \phi - D \tilde\phi \nabla^2 \phi \right]. \tag{38}$$



Here we have also neglected terms for initial and final conditions. We first convert to
a real density field $\rho$ using the methods from Section 2.3 by setting
$\phi = \rho e^{-\tilde\rho}$ and $\tilde\phi = e^{\tilde\rho}$. In this case the exponential terms
cancel out and we are left, after integration by parts, with:

$$A_D(\tilde\rho, \rho) = \int d^dx \int dt \left[ \tilde\rho\,\partial_t \rho - D \tilde\rho \nabla^2 \rho + D \rho (\nabla \tilde\rho)^2 \right]. \tag{39}$$



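The final noise term is linearised using Eq. (20), but is of opposite sign, therefore
giving the linearised Action:

$$A_D(\tilde\rho, \rho, \eta) = \int d^dx \int dt \left[ \tilde\rho\,\partial_t \rho - D \tilde\rho \nabla^2 \rho + \nabla\left(\eta \sqrt{2D\rho}\right) \tilde\rho \right], \tag{40}$$

which on functional differentiation with respect to $\tilde\rho$ yields Eq. (37) (the sign
of the noise term is immaterial, since $\eta$ is symmetric about zero).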
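The local conservation property of the noise in Eq. (37) can be made concrete with a small numerical sketch. The code below is an illustrative Euler-Maruyama discretisation on a periodic one-dimensional lattice, written in flux form so that whatever the noise removes from one site it adds to a neighbour. It is not the integration scheme used for the results above; the function name `conserved_noise_step`, the arithmetic-mean edge density and the clipping of negative edge densities are all assumptions of this example.

```python
import numpy as np


def conserved_noise_step(rho, D, dt, dx, rng):
    """One Euler-Maruyama step of Eq. (37) on a periodic 1D lattice."""
    # Density on the edge between sites i and i+1 (arithmetic mean, clipped at zero).
    rho_edge = np.maximum(0.5 * (rho + np.roll(rho, -1)), 0.0)
    # Discretised white noise: variance 1 / (dx * dt) per edge and time step.
    eta = rng.standard_normal(rho.size) / np.sqrt(dx * dt)
    # Noise 'current' eta * sqrt(2 D rho) living on each edge.
    flux = eta * np.sqrt(2.0 * D * rho_edge)
    # Discrete divergence at site i: difference of the two adjacent edge currents.
    div_noise = (flux - np.roll(flux, 1)) / dx
    lap = (np.roll(rho, 1) + np.roll(rho, -1) - 2.0 * rho) / dx ** 2
    return rho + dt * (D * lap + div_noise)


# Illustrative run: all population starts at one site, as in Figure 2.
rng = np.random.default_rng(1)
rho = np.zeros(64)
rho[32] = 10000.0
for _ in range(100):
    rho = conserved_noise_step(rho, D=0.25, dt=0.1, dx=1.0, rng=rng)
total = rho.sum()
```

Because both the Laplacian and the noise enter as differences of edge quantities, the total population is conserved to floating point accuracy. Individual sites can still become slightly negative with this naive scheme, which is one reason more careful integrators are preferred in practice.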